Simple, Efficient Shared Memory Simulations (Extended Abstract)

Author

  • Martin Dietzfelbinger
Abstract

We present three shared memory simulations on distributed memory machines (DMMs), which use universal hashing to distribute the shared memory cells over the memory modules of the DMM. We measure their quality in terms of delay, time-processor efficiency, memory contention (how many requests have to be satisfied by one memory module per simulated step) and simplicity. Further we take into consideration different rules for resolving access conflicts at the modules of the DMM, in particular the c-collision rule motivated by the idea of communicating between processors and modules using an optical crossbar. All simulations are very simple and deterministic (except for the random choice of the hash functions). In particular, we present the first "deterministic" time-processor optimal simulations with delay O(log n), both on Arbitrary DMMs and 2-collision DMMs. (These models are defined in the paper.) The central idea for the latter simulation also yields a simple "deterministic" simulation of an n-processor PRAM on an n-processor 3-collision DMM with delay bounded by O(log log n) with high probability. For the time analysis of the simulations we utilize a new combinatorial lemma, which may be of independent interest. The lemma concerns events defined by properties of the color classes in random colorings of finite sets. Such events are not independent; the lemma shows that in an important special case such events are "negatively correlated", and thus, for the purpose of upper bounds on certain probabilities, may be treated as if independent.
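As a purely illustrative sketch (not the paper's actual construction), the core setup can be modeled as follows: shared memory cells are mapped to modules by a randomly chosen hash function, and the access-conflict rule determines how many rounds one simulated PRAM step costs. The Carter–Wegman family used here, and all parameters, are assumptions for the sketch; the paper's hash classes and protocols are not reproduced.

```python
import random

def make_universal_hash(p, m):
    """Draw h(x) = ((a*x + b) mod p) mod m from the classic
    Carter-Wegman universal family (an illustrative stand-in for
    the hash classes used in the paper)."""
    a = random.randrange(1, p)
    b = random.randrange(0, p)
    return lambda x: ((a * x + b) % p) % m

def module_loads(requests, h):
    """Count how many requests each memory module receives."""
    loads = {}
    for x in requests:
        loads[h(x)] = loads.get(h(x), 0) + 1
    return loads

def served_first_round(requests, h, c):
    """c-collision rule: a module answers its requests in a round
    only if it received at most c of them."""
    loads = module_loads(requests, h)
    return {x for x in requests if loads[h(x)] <= c}

def delay_arbitrary(requests, h):
    """Arbitrary rule: each module answers one pending request per
    round, so the delay of one simulated step equals the maximum
    load of any module (the memory contention)."""
    return max(module_loads(requests, h).values())
```

The point of the universal hashing is visible here: the delay of a step is governed by the maximum module load, which the random hash function keeps small with high probability.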
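The combinatorial lemma's statement can be checked on a toy instance by exact enumeration. The following sketch (with assumed toy parameters n, m, t; the lemma's actual hypotheses and proof are not reproduced) verifies that two events of the form "color class i is large" satisfy P(A and B) <= P(A) * P(B), i.e. they behave as if negatively correlated:

```python
from fractions import Fraction
from itertools import product

def class_sizes(coloring, m):
    """Sizes of the m color classes of one coloring."""
    sizes = [0] * m
    for color in coloring:
        sizes[color] += 1
    return sizes

def prob(event, n, m):
    """Exact probability of `event` under a uniformly random
    coloring of an n-element set with m colors, computed by full
    enumeration (so keep n and m small)."""
    hits = sum(1 for col in product(range(m), repeat=n)
               if event(class_sizes(col, m)))
    return Fraction(hits, m ** n)

# Assumed toy parameters: color an 8-element set with 3 colors and
# call a color class "large" if it has at least 4 elements.
n, m, t = 8, 3, 4
A = lambda s: s[0] >= t            # class 0 is large
B = lambda s: s[1] >= t            # class 1 is large
both = lambda s: s[0] >= t and s[1] >= t
assert prob(both, n, m) <= prob(A, n, m) * prob(B, n, m)
```

The events are clearly dependent (a large class 0 leaves fewer elements for class 1), and the inequality is the direction that lets upper bounds treat them as if independent.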


Related articles

Scalable Parallel Sparse Factorization with Left-Right Looking Strategy on Shared Memory Multiprocessors

An efficient sparse LU factorization algorithm on popular shared memory multiprocessors is presented. Interprocess communication is critically important on these architectures; the algorithm introduces only O(n) synchronization events. No global barrier is used, and a completely asynchronous scheduling scheme is one central point of the implementation. The algorithm aims at optimizing the single n...


A Concurrent Fast-Fits Memory Manager (University of Florida, Dept. of CIS, Electronic TR91-009)

Shared memory multiprocessor systems need efficient dynamic storage allocators, both for system purposes and to support parallel programs. Most memory manager algorithms are based either on a free list, which provides efficient memory use, or on a buddy system, which provides fast allocation and release. In this paper, we present two versions of a memory manager based on the fast-fits algorithm, whic...


A Scalable and Efficient Storage Allocator on Shared-Memory Multiprocessors

An efficient dynamic storage allocator is important for time-critical parallel programs. In this paper, we present a fast and simple parallel allocator for fixed-size blocks on shared-memory multiprocessors. We show both theoretically and empirically that the allocator incurs very low lock contention. The allocator is tested with parallel simulation applications with frequent allocation and release ...



Uppsala Theses in Computing Science 25: Data-Parallel Implementation of Prolog

Parallel computers are rarely used to make individual programs execute faster; they are mostly used to increase throughput by running separate programs in parallel. The reason is that parallel computers are hard to program, and automatic parallelisation has proven difficult. The commercially most successful technique is loop parallelisation in Fortran. Its success builds on an elegant model for ...




Publication date: 1993